Abstract

A receiver wants to compute a function of two correlated sources separately observed by two transmitters. In the 
system model of interest, one of the transmitters may send some data to the other transmitter in a cooperation phase 
before both transmitters convey data to the receiver. What is the minimum number of noiseless bits that need to be
communicated by each transmitter to the receiver for a given number of cooperation bits?

This paper investigates both the function computation and the rate distortion versions of this problem; in the first 
case, the receiver wants to compute the function exactly and in the second case the receiver wants to compute the 
function within some distortion. 

For the function computation version, a general inner bound to the rate region is exhibited and shown to be tight
in a number of cases: the function is partially invertible, full cooperation, one-round point-to-point communication, 
two-round point-to-point communication, and cascade. As a corollary, it is shown that one bit of cooperation may 
arbitrarily reduce the amount of information both transmitters need to convey to the receiver. 

For the rate distortion version, an inner bound to the rate region is exhibited which always includes, and sometimes 
strictly, the convex hull of Kaspi-Berger's related inner bounds. 

I. Introduction 

Distributed function computation has been a long studied source coding problem in information theory. The
point-to-point case was investigated in the context of interactive communication by Orlitsky and Roche [16] who
derived the rate region for one-round and two-round communication using the concept of conditional characteristic
graph defined and developed by Körner [12] and Witsenhausen [23], respectively. The generalization to m > 1
round communication was considered by Ma and Ishwar [15].

The first setting with more than one source can be attributed to Slepian and Wolf who investigated the multiple
access configuration where a receiver wants to recover the sources perfectly [20], [2]. Later, Körner and Marton
considered the specific setting where a receiver wants to compute the sum modulo two of two binary sources and
derived the rate region in the specific case where the sources have a symmetric distribution [13]. This result has
been generalized for sum modulo p for an arbitrary prime number p in [7] and [26]. More recently, we derived
the rate regions for the setting where the function is partially invertible (and arbitrary distributions) and the case
of independent sources (and arbitrary functions) [17].

Fig. 1. Function computation with cooperative transmitters

Except for these cases, the rate region for two sources remains an open problem in general. For instance, for the 
sum modulo two and arbitrary distributions, the best known rate region inner bound is the one obtained by Ahlswede
and Han [1]. Building on this work, Huang and Skoglund derived an achievable rate region for a certain class of 
polynomial functions which is larger than the Slepian-Wolf rate region [10], [8], [9]. Finally, a variation of the 
problem where the receiver wants to compute some subspace generated by the sources has been investigated by 
Lalitha et al. [14]. 

The cascade network configuration has been considered in the cases where there is no side information at the 
receiver by Cuff et al. [4] and where the sources form a Markov chain by Viswanathan [22]. The general case was 
recently investigated in [19]. 

The aforementioned network configurations — point-to-point, multiple access, and cascade — are special cases of 
the network configuration depicted in Fig. 1. Two sources, X and Y, are separately observed by two transmitters,
and a receiver wants to compute a function $f(X, Y)$ of the sources. Transmitter-X first sends some information to
transmitter-Y at rate $R_0$ (cooperation phase), then transmitter-X and transmitter-Y send information to the receiver
at rates $R_X$ and $R_Y$, respectively.¹ This paper investigates this setting in the context of both function computation
and rate distortion.

The first part of the paper is devoted to function computation. The main result is a general rate region inner 
bound that is tight in a number of special cases: 

• unlimited cooperation, i.e., when transmitter-Y knows X;

• the function is partially invertible, i.e., when X is a function of $f(X, Y)$;

• one and two-round point-to-point communication for which we recover the results in [16];

• cascade network for which we recover the results of [4], [21];

• no cooperation: partially invertible function and arbitrary distribution, or arbitrary function and independent
sources, for which we recover the results of [18].

An interesting illustration of the second case shows that the sum rate per cooperation rate, i.e., $(R_X + R_Y)/R_0$, can
be arbitrarily large.

¹ There exists a similar problem in which the message sent from transmitter-X to transmitter-Y can also be heard by the receiver.
This problem has been considered in the context of rate distortion by Kaspi and Berger [11] and in the context of function computation
by Ericsson and Körner [6].

In the second part of the paper we consider the problem where the receiver wants to recover some functions
$f_1(X,Y)$ and $f_2(X,Y)$ within some distortions. For the special case where $f_1(X,Y) = X$ and $f_2(X,Y) = Y$,
Kaspi and Berger [11] proposed two inner bounds. The first one is a general inner bound while the second one
is valid and tight in the full cooperation case only. These bounds easily generalize to arbitrary functions by using
arguments similar to those used by Yamamoto in [25, Proof of Theorem 1] to extend Wyner and Ziv's result [24,
Theorem 1] from identity functions to arbitrary functions.

Building on ideas used to establish the inner bound for the function computation problem, we derive a new
inner bound for the rate distortion problem which always includes, and in certain cases strictly, the convex hull of
Kaspi-Berger's inner bounds [11, Theorems 5.1 and 5.4] generalized to arbitrary functions.

The paper is organized as follows. In Section II we formally state the problem and provide some background 
material and definitions. In Section III, we present our results in two subsections: function computation and rate
distortion problems, and in Section IV we provide a proof sketch of our main result. 

II. Problem Statement and Preliminaries 

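We use calligraphic fonts to denote the range of the corresponding random variable. For instance, $\mathcal{X}$ denotes the
range of $X$ and $\mathcal{T}$ denotes the range of $T$.

Let $\mathcal{X}$, $\mathcal{Y}$, $\mathcal{F}$, $\mathcal{F}_1$, and $\mathcal{F}_2$ be finite sets. Further, define

$$f : \mathcal{X} \times \mathcal{Y} \to \mathcal{F}$$

$$f_i : \mathcal{X} \times \mathcal{Y} \to \mathcal{F}_i, \quad i \in \{1, 2\}$$

$$d_i : \mathcal{F}_i \times \mathcal{F}_i \to \mathbb{R}^+, \quad i \in \{1, 2\}.$$

Let $\{(x_i, y_i)\}_{i=1}^{\infty}$ be independent instances of random variables $(X, Y)$ taking values over $\mathcal{X} \times \mathcal{Y}$ and distributed
according to $p(x, y)$. Define $\mathbf{X} = X_1, \ldots, X_n$ and

$$f(\mathbf{X}, \mathbf{Y}) = f(X_1, Y_1), \ldots, f(X_n, Y_n).$$

Similarly define $f_1(\mathbf{X}, \mathbf{Y})$ and $f_2(\mathbf{X}, \mathbf{Y})$.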

Next, we recall the notions of achievable rate tuples for function computation and rate distortion. For function
computation it is customary to consider asymptotic zero block error probability whereas for rate distortion it is customary
to consider bit average distortion.

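Definition 1 (Code). An $(n, R_0, R_X, R_Y)$ code for the function computation problem consists of three encoding
functions

$$\varphi_0 : \mathcal{X}^n \to \{1, 2, \ldots, 2^{nR_0}\}$$
$$\varphi_X : \mathcal{X}^n \to \{1, 2, \ldots, 2^{nR_X}\}$$
$$\varphi_Y : \mathcal{Y}^n \times \{1, 2, \ldots, 2^{nR_0}\} \to \{1, 2, \ldots, 2^{nR_Y}\}$$

and a decoding function

$$\psi : \{1, 2, \ldots, 2^{nR_X}\} \times \{1, 2, \ldots, 2^{nR_Y}\} \to \mathcal{F}^n.$$

The corresponding error probability is defined as

$$P\big(\psi(\varphi_X(\mathbf{X}), \varphi_Y(\varphi_0(\mathbf{X}), \mathbf{Y})) \ne f(\mathbf{X}, \mathbf{Y})\big).$$

An $(n, R_0, R_X, R_Y)$ code for the rate distortion problem consists of three encoding functions defined as for the
function computation problem, and two decoding functions

$$\psi_i : \{1, 2, \ldots, 2^{nR_X}\} \times \{1, 2, \ldots, 2^{nR_Y}\} \to \mathcal{F}_i^n, \quad i \in \{1, 2\}.$$

The corresponding average distortions are defined as²

$$E\, d_i\big(f_i(\mathbf{X}, \mathbf{Y}), \psi_i(\varphi_X(\mathbf{X}), \varphi_Y(\varphi_0(\mathbf{X}), \mathbf{Y}))\big) = \frac{1}{n} \sum_{j=1}^{n} E\, d_i\big(f_i(X_j, Y_j), \psi_i(\varphi_X(\mathbf{X}), \varphi_Y(\varphi_0(\mathbf{X}), \mathbf{Y}))_j\big)$$

for $i \in \{1, 2\}$. In the above expression $\psi_i(\varphi_X(\mathbf{X}), \varphi_Y(\varphi_0(\mathbf{X}), \mathbf{Y}))_j$ refers to the jth component of the length n
vector $\psi_i(\varphi_X(\mathbf{X}), \varphi_Y(\varphi_0(\mathbf{X}), \mathbf{Y}))$.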

Definition 2 (Function Computation Rate Region). A rate tuple $(R_0, R_X, R_Y)$ is achievable if, for any $\varepsilon > 0$ and
all n large enough, there exists an $(n, R_0, R_X, R_Y)$ code whose error probability is no larger than $\varepsilon$. The rate
region is the closure of the set of achievable rate tuples $(R_0, R_X, R_Y)$.

Definition 3 (Rate Distortion Region). Let $D_1, D_2$ be two non-negative constants. A rate tuple $(R_0, R_X, R_Y)$ is
achievable with distortions $D_1$ and $D_2$ if, for any $\varepsilon > 0$ and all n large enough, there exists an $(n, R_0, R_X, R_Y)$
code whose average distortions are no larger than $D_1$ and $D_2$, respectively. The rate distortion region with respect
to $D_1$ and $D_2$ is the closure of the set of achievable rate tuples $(R_0, R_X, R_Y)$ with distortions $D_1$ and $D_2$.

The problems we consider are the characterizations of 

i. the function computation rate region for given function $f(x, y)$ and distribution $p(x, y)$;

ii. the rate distortion region for given functions $f_1(x,y)$, $f_2(x,y)$, distribution $p(x,y)$, and distortion constraints
$D_1$, $D_2$.

² We use E to denote expectation.



Conditional characteristic graphs play a key role in function computation problems [23], [12], [17]. Below we
introduce a general definition of conditional characteristic graph. 

Remark 1. Given two random variables X and V, where X ranges over $\mathcal{X}$ and V over subsets of $\mathcal{X}$,³ we write
$X \in V$ whenever $P(X \in V) = 1$.

Recall that an independent set of a graph G is a subset of vertices no two of which are connected. The set of
independent sets of G is denoted by $\Gamma(G)$.

Definition 4 (Generalized Conditional Characteristic Graph). Let L, K, and S be arbitrary discrete random variables
with $(L, K, S) \sim p(l, k, s)$. Let $f : \mathcal{S} \to \mathbb{R}$ be a function such that $H(f(S)|L, K) = 0$. The conditional characteristic
graph $G_{L|K}(f)$ of L given K with respect to the function $f(s)$ is the graph whose vertex set is $\mathcal{L}$ and such that
$l_1 \in \mathcal{L}$ and $l_2 \in \mathcal{L}$ are connected if for some $s_1, s_2 \in \mathcal{S}$, and $k \in \mathcal{K}$

i. $p(l_1, k, s_1) \cdot p(l_2, k, s_2) > 0$,

ii. $f(s_1) \ne f(s_2)$.

When there is no ambiguity for the function $f(s)$, the above conditional characteristic graph is denoted by $G_{L|K}$.
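Definition 4 can be turned into a few lines of code. The sketch below is only an illustration: the function name `characteristic_graph`, the dictionary encoding of $p(l, k, s)$, and the toy source are assumptions, not part of the paper. For simplicity the vertex set is restricted to the values of L that have positive probability.

```python
from itertools import combinations

def characteristic_graph(pmf, f):
    """Return (vertices, edges) of G_{L|K}(f) for a pmf given as {(l, k, s): prob}."""
    support = [(l, k, s) for (l, k, s), p in pmf.items() if p > 0]
    # Simplification (assumption): only values of L with positive probability are kept.
    vertices = sorted({l for l, _, _ in support})
    edges = set()
    for l1, l2 in combinations(vertices, 2):
        # l1 and l2 are connected if, for some common k, there are s1, s2 with
        # p(l1, k, s1) * p(l2, k, s2) > 0 and f(s1) != f(s2).
        if any(ka == kb and f(sa) != f(sb)
               for la, ka, sa in support if la == l1
               for lb, kb, sb in support if lb == l2):
            edges.add((l1, l2))
    return vertices, edges

# Hypothetical toy instance: L = X, K = Y, S = (X, Y), f(x, y) = 1 if x == 2 else 0.
pmf = {(x, y, (x, y)): 1 / 6 for x in range(3) for y in range(2)}
indicator = lambda s: int(s[0] == 2)
vertices, edges = characteristic_graph(pmf, indicator)
```

In this toy instance the values 0 and 1 of X never lead to different function values, so they are not connected and form an independent set, whereas 2 is connected to both.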

Definition 5 (Conditional Graph Entropy [16]). Given $(L, K, S) \sim p(l, k, s)$ and $f : \mathcal{S} \to \mathbb{R}$ such that $H(f(S)|L, K) = 0$,
the conditional graph entropy $H(G_{L|K}(f))$ is defined as

$$H(G_{L|K}(f)) = \min_{\substack{V - L - K \\ L \in V \in \Gamma(G_{L|K}(f))}} I(V; L|K)$$

where $V - L - K$ refers to the standard Markov chain notation.

III. Results

In the first part of this section we consider the function computation problem formulation and in the second part 
of the section we consider the corresponding rate distortion formulation. 

A. Computation 

Given a finite set $\mathcal{S}$, we use $\mathrm{M}(\mathcal{S})$ to denote the collection of all multisets of $\mathcal{S}$.⁴ Our first result is a general
inner bound to the function computation rate region (see Definition 2).

³ I.e., a sample of V is a subset of $\mathcal{X}$.

⁴ A multiset of a set $\mathcal{S}$ is a collection of elements from $\mathcal{S}$ possibly with repetitions, e.g., if $\mathcal{S} = \{0, 1\}$, then $\{0, 1, 1\}$ is a multiset.



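Theorem 1 (Inner Bound - Computation). $(R_0, R_X, R_Y)$ is achievable whenever

$$R_0 \ge I(X; U|Y)$$
$$R_X \ge I(V; X|T, W)$$
$$R_Y \ge I(U, Y; W|V, T)$$
$$R_X + R_Y \ge I(X, Y; V, T, W) + I(U; W|V, X, T, Y), \tag{1}$$

for some T, U, V, and W with alphabets $\mathcal{T}$, $\mathcal{U}$, $\mathcal{V}$, and $\mathcal{W}$, respectively, that satisfy

$$T - U - X - Y$$
$$V - (X, T) - (U, Y) - W, \tag{2}$$

and

$$X \in V \in \mathrm{M}(\Gamma(G_{T,X|T,U,Y}))$$
$$(U, Y) \in W \in \mathrm{M}(\Gamma(G_{T,U,Y|T,V})). \tag{3}$$

Moreover, the following cardinality bounds hold

$$|\mathcal{T}| \le |\mathcal{X}| + 4$$
$$|\mathcal{V}| \le (|\mathcal{X}| + 4) \cdot |\mathcal{X}| + 1$$
$$|\mathcal{W}| \le |\mathcal{U}| \cdot |\mathcal{Y}| + 1. \tag{4}$$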

The last part of the theorem says that the achievable rate region (1) is maximal for random variables T, V, and
W defined over sets whose cardinalities are bounded as in (4). Note that in the graphs $G_{T,X|T,U,Y}$ and $G_{T,U,Y|T,V}$
the random variable T can be interpreted as a time sharing random variable over a set of conditional characteristic
graphs.

The rate region characterized in Theorem 1 turns out to be tight in a number of interesting cases which we now
list. The first case holds when the function is partially invertible with respect to X, i.e., when X is a function of
$f(X, Y)$.

Theorem 2 (Partially Invertible Function). The inner bound is tight when $f(X, Y)$ is partially invertible with
respect to X. In this case, the rate region reduces to

$$R_0 \ge I(X; U|Y)$$
$$R_X \ge H(X|U, W)$$
$$R_Y \ge I(Y; W|X, U)$$
$$R_X + R_Y \ge H(X) + I(Y; W|U),$$

for some U and W with alphabets $\mathcal{U}$ and $\mathcal{W}$, respectively, that satisfy

$$U - X - Y$$
$$X - (U, Y) - W, \tag{5}$$

and

$$Y \in W \in \mathrm{M}(\Gamma(G_{U,Y|X,U})). \tag{6}$$

Moreover, the following cardinality bounds hold

$$|\mathcal{U}| \le |\mathcal{X}| + 4$$
$$|\mathcal{W}| \le (|\mathcal{X}| + 4) \cdot |\mathcal{Y}| + 1. \tag{7}$$

Fig. 2. Minimum sum rate $R_X + R_Y$ as a function of the cooperation rate $R_0$ for the partially invertible function of Example 1 with a = 3
and b = 10.

In the following example we apply Theorem 2 to show that one bit of cooperation may arbitrarily reduce the
minimum sum rate $R_X + R_Y$.

Example 1. Let $a \ge 2$ and $b \ge 1$ be two natural numbers. Let X be uniform over $\{1, 2, \ldots, a\}$, and let
$Y = (Y_1, Y_2, \ldots, Y_a)$ where the $Y_i$, $i \in \{1, 2, \ldots, a\}$, are independent random variables, each of them uniformly
distributed over $\{1, \ldots, 2^b\}$ and independent of X. The receiver wants to recover X and $Y_X$, i.e., $f(X, Y) = (X, Y_X)$.

From Theorem 2 and the fact that X and Y are independent, the rate region is given by

$$R_0 \ge I(X; U)$$
$$R_X \ge H(X|U)$$
$$R_Y \ge I(Y; W|U)$$
$$R_X + R_Y \ge H(X) + I(Y; W|U)$$

for some U and W that satisfy (5), (6), and (7).

We evaluate the sum rate constraint. Since X is uniformly distributed, $H(X) = \log_2(a)$. Now, due to the
independence of X and Y and the Markov chain $U - X - Y$ we have

$$H(Y|U) = H(Y) = a \cdot b. \tag{8}$$

Further, by Definition 4, for each $u \in \mathcal{U}$, $(u, y) = (u, (y_1, y_2, \ldots, y_a))$ and $(u, y') = (u, (y'_1, y'_2, \ldots, y'_a))$ are
connected in $G_{U,Y|X,U}$ if and only if $y_x \ne y'_x$ for some

$$x \in A_u = \{x : p(u, x) > 0\}.$$

Hence, because W satisfies (6), conditioned on $U = u$ the maximum number of elements in an independent set
$w \in \mathcal{W}$ that contains vertices $(u, y)$, $y \in \mathcal{Y}$, is $2^{b(a - |A_u|)}$.⁵ Therefore,

$$H(Y|W, U = u) = b \cdot (a - |A_u|), \tag{9}$$

by letting W take as values maximal independent sets.
Equations (8) and (9) give

$$\min_W I(Y; W|U) = b \cdot \sum_{u \in \mathcal{U}} |A_u| \cdot p(u),$$

and therefore

$$R_0 = \log_2(a) + \sum_{x,u} p(x, u) \cdot \log_2 p(x|u)$$

$$R_X + R_Y = \log_2(a) + b \cdot \sum_{u \in \mathcal{U}} |A_u| \cdot p(u) \tag{10}$$

for any valid choice of U. By considering all random variables U over alphabets of no more than a + 4 elements
that satisfy the Markov chain $U - X - Y$, one can numerically evaluate the minimum achievable sum rate for
all values of $R_0$ using the above equations. Fig. 2 shows the minimum achievable sum rate $R_X + R_Y$ as a function
of $R_0$ for a = 4 and b = 10.

⁵ We use $|A_u|$ to denote the cardinality of $A_u$.

Fig. 3. (a) Full cooperation, (b) Two-round point-to-point communication, (c) One-round point-to-point communication, (d) Cascade.



Choosing $U \in \{0, 1\}$ in (10) such that

$$p(U = 0|X = 1) = p(U = 0|X = 2) = p(U = 1|X = 3) = p(U = 1|X = 4) = 1$$
$$p(U = 0|X = 3) = p(U = 0|X = 4) = p(U = 1|X = 1) = p(U = 1|X = 2) = 0$$

shows that

$$R_0 = 1$$
$$R_X + R_Y = 2 + 2 \cdot b \tag{11}$$

is achievable.

When $R_0 = 0$ the minimum sum rate is given by

$$\min(R_X + R_Y) = 2 + 4 \cdot b \tag{12}$$

from [17, Theorem 3] and using the fact that the function is partially invertible and that the sources are independent.⁶
From (11) and (12) we deduce that one bit of cooperation decreases the sum rate by at least $2 \cdot b$, which can be
arbitrarily large since b is an arbitrary natural number.
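The numbers in (11) and (12) can be reproduced by evaluating the two expressions in (10) directly. The following is a minimal sketch under stated assumptions: the helper name `example1_rates` and the list-of-dictionaries encoding of $p(u|x)$ are illustrative choices, and X is taken uniform over $\{1, \ldots, a\}$ as in Example 1.

```python
from math import log2

def example1_rates(a, b, p_u_given_x):
    """Evaluate (R_0, R_X + R_Y) from (10) for X uniform over {1, ..., a}.

    p_u_given_x[x - 1] is a dict {u: p(u | x)} (hypothetical encoding).
    """
    p_xu = {}
    for x, row in enumerate(p_u_given_x, start=1):
        for u, prob in row.items():
            if prob > 0:
                p_xu[(x, u)] = prob / a  # p(x, u) = p(x) * p(u | x)
    p_u = {}
    for (x, u), prob in p_xu.items():
        p_u[u] = p_u.get(u, 0.0) + prob
    # |A_u| is the number of x with p(u, x) > 0.
    size_a = {u: sum(1 for (_, uu) in p_xu if uu == u) for u in p_u}
    r0 = log2(a) + sum(p * log2(p / p_u[u]) for (_, u), p in p_xu.items())
    sum_rate = log2(a) + b * sum(size_a[u] * p_u[u] for u in p_u)
    return r0, sum_rate

# One bit of cooperation: U tells whether X is in {1, 2} or in {3, 4}.
one_bit = [{0: 1.0}, {0: 1.0}, {1: 1.0}, {1: 1.0}]
# No cooperation: U is a constant.
no_coop = [{0: 1.0}] * 4

r0_coop, sum_coop = example1_rates(4, 10, one_bit)
r0_none, sum_none = example1_rates(4, 10, no_coop)
```

For a = 4 and b = 10 this gives the pair $(R_0, R_X + R_Y) = (1, 22)$ with one bit of cooperation and $(0, 42)$ without, in agreement with $2 + 2 \cdot b$ and $2 + 4 \cdot b$.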

The next three theorems provide three other cases where Theorem 1 is tight. In each of them, one of the links
is rate unlimited.

When there is full cooperation between transmitters, i.e., when transmitter-Y has full access to source X, the
setting is captured by the condition $R_0 \ge H(X|Y)$ and is depicted in Fig. 3(a).

Theorem 3 (Full Cooperation). The inner bound is tight when

$$R_0 \ge H(X|Y).$$

In this case, the rate region reduces to

$$R_0 \ge H(X|Y)$$
$$R_Y \ge H(f(X, Y)|T)$$
$$R_X + R_Y \ge H(f(X, Y)) + I(X; T|f(X, Y)),$$

for some T with alphabet $\mathcal{T}$ that satisfies

$$T - X - Y,$$

with cardinality bound

$$|\mathcal{T}| \le |\mathcal{X}| + 1.$$

⁶ More generally, one can easily check that the minimum sum rate without cooperation is $\log_2(a) + a \cdot b$.

Fig. 4. Example of the rate region for a partially invertible function for $R_0 = 0$ and $R_0 = H(X|Y)$.

In the following example, we derive the rate region for a partially invertible function when there is no cooperation
and when there is full cooperation.

Example 2. Let $f(x, y) = (-1)^y \cdot x$, with $\mathcal{X} = \mathcal{Y} = \{0, 1, 2\}$, and

$$p(x, y) = \begin{bmatrix} .21 & .03 & .12 \\ .06 & .15 & .16 \\ .03 & .12 & .12 \end{bmatrix}.$$

The rate region when $R_0 = 0$ was derived in [17, Example 4] and is depicted by the gray area in Fig. 4. With full
cooperation, i.e., $R_0 = H(X|Y) = 1.38$, using Theorem 3 the rate region is the union of the gray and the black
areas in Fig. 4. Note that the black area, which represents the difference between the two regions, is non-symmetric
with respect to X and Y, as can be expected.
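The value $H(X|Y) = 1.38$ quoted in Example 2 can be checked numerically. The sketch below assumes that the rows of the matrix are indexed by x and the columns by y (the text does not state the orientation; the transposed reading gives the same value to two decimals). The helper name is illustrative.

```python
from math import log2

def conditional_entropy_x_given_y(p):
    """H(X|Y) in bits for a joint pmf p[x][y] (rows indexed by x: an assumption)."""
    h_xy = -sum(v * log2(v) for row in p for v in row if v > 0)
    p_y = [sum(col) for col in zip(*p)]  # marginal of Y = column sums
    h_y = -sum(v * log2(v) for v in p_y if v > 0)
    return h_xy - h_y                    # H(X|Y) = H(X, Y) - H(Y)

p_example2 = [[0.21, 0.03, 0.12],
              [0.06, 0.15, 0.16],
              [0.03, 0.12, 0.12]]
h_x_given_y = conditional_entropy_x_given_y(p_example2)
```

The result is approximately 1.38 bits, which is the cooperation rate beyond which Theorem 3 applies in this example.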

When $R_Y$ is unlimited, i.e., when the receiver looks over the shoulder of transmitter-Y, the setting is captured by
condition $R_Y \ge R_0 + H(Y)$ and reduces to point-to-point communication as depicted in Fig. 3(c) with the transmitter
observing X and the receiver observing Y. The rate region for this case was established in [16, Theorem 1].
Theorem 4 (One-Round Point-to-Point Communication). The inner bound is tight when

$$R_Y \ge R_0 + H(Y).$$

In this case, the rate region reduces to

$$R_0 + R_X \ge H(G_{X|Y}).$$

When condition $R_X \ge H(X)$ holds, the situation reduces to the two-round communication setting depicted in
Fig. 3(b). The receiver, having access to X, first conveys information to transmitter-Y, which then replies.

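Theorem 5 (Two-Round Point-to-Point Communication). The inner bound is tight when

$$R_X \ge H(X).$$

In this case, the rate region reduces to

$$R_0 \ge I(X; U|Y)$$
$$R_X \ge H(X)$$
$$R_Y \ge I(Y; W|X, U) \tag{13}$$

for some U and W with alphabets $\mathcal{U}$ and $\mathcal{W}$, respectively, that satisfy

$$U - X - Y$$
$$X - (U, Y) - W,$$

and

$$Y \in W \in \mathrm{M}(\Gamma(G_{U,Y|X,U})),$$

with cardinality bounds

$$|\mathcal{U}| \le |\mathcal{X}| + 2$$
$$|\mathcal{W}| \le (|\mathcal{X}| + 2) \cdot |\mathcal{Y}| + 1.$$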

The rate region in the above case $R_X \ge H(X)$ was previously established in [16, Theorem 3], and has been
generalized in [15, Theorem 1] for m-round point-to-point communication with cardinality bounds on the alphabet
of auxiliary variables. However, in both works, the range of the auxiliary random variable W was left unspecified,
except for the condition that U, W, X should determine $f(X, Y)$. By contrast, Theorem 5 specifies W to range over
independent sets of a suitable graph. Also, the cardinality bounds are tighter than the bounds derived
in [15, Theorem 1].



Finally, when $R_X = 0$ there is no direct link between transmitter-X and the receiver and the situation reduces
to the cascade setting depicted in Fig. 3(d). The rate region for this case was established in [4, Theorem 3.1] (see
also [21, Theorem 2]).

Theorem 6 (Cascade). The inner bound is tight when

$$R_X = 0.$$

In this case, the rate region reduces to

$$R_0 \ge H(G_{X|Y})$$
$$R_Y \ge H(f(X, Y)).$$

B. Rate Distortion 

Theorem 1 gives an inner bound to the rate distortion problem (see Definition 3) with zero distortions when
both distortion functions are the same. It turns out that this inner bound is in general larger than the rate region
obtained by Kaspi and Berger in [11, Theorem 5.1] for zero distortions. The reason for this lies in the achievable
scheme on which Kaspi and Berger's inner bound relies. For any distortions their scheme implicitly allows
the receiver to perfectly decode whatever is transmitted from transmitter-X to transmitter-Y. By contrast, we do not
impose this constraint in the achievability scheme that yields Theorem 1. More generally, by relaxing this constraint
it is possible to obtain an achievable rate region that contains, and in certain cases strictly, the rate region given
by [11, Theorem 5.1]. This is given by Theorem 7 below. For the specific full cooperation case, Theorem 7 reduces
to [11, Theorem 5.4]. As a result, Theorem 7 always includes the convex hull of the two regions [11, Theorems
5.1 and 5.4] generalized to arbitrary functions, and this inclusion is strict in certain cases.

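Theorem 7 (Inner Bound - Rate Distortion). $(R_0, R_X, R_Y)$ is achievable with distortions $D_1$ and $D_2$ whenever

$$R_0 \ge I(X; U|Y)$$
$$R_X \ge I(V; X|T, W)$$
$$R_Y \ge I(U, Y; W|V, T)$$
$$R_X + R_Y \ge I(X, Y; V, T, W) + I(U; W|V, X, T, Y)$$

for some T, U, V, and W with alphabets $\mathcal{T}$, $\mathcal{U}$, $\mathcal{V}$, and $\mathcal{W}$, respectively, that satisfy

$$T - U - X - Y$$
$$V - (X, T) - (U, Y) - W,$$

and if there exist functions $g_1(V, T, W)$ and $g_2(V, T, W)$ such that

$$E\, d_i(f_i(X, Y), g_i(V, T, W)) \le D_i, \quad i \in \{1, 2\},$$

with cardinality bounds

$$|\mathcal{T}| \le |\mathcal{X}| + 4$$
$$|\mathcal{V}| \le (|\mathcal{X}| + 4) \cdot |\mathcal{X}| + 1$$
$$|\mathcal{W}| \le |\mathcal{U}| \cdot |\mathcal{Y}| + 1.$$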



To obtain the general inner bound [11, Theorem 5.1] it suffices to let T = U in Theorem 7. To obtain the specific
full cooperation inner bound [11, Theorem 5.4], it suffices to let U = X and let V be a constant in Theorem 7.
Hence, Theorem 7 always includes the convex hull of the two schemes [11, Theorems 5.1 and 5.4]. The following
two examples show that this inclusion is strict in certain cases.

In the first example one of the distortion functions is defined on both sources X and Y, while in the second
example the distortion functions are defined on each source separately, as considered by Kaspi and Berger (see
[11, Section II]).

Example 3. Let $X = (X_1, X_2)$ and $Y$ be such that $X_1$ and $Y$ are uniformly distributed over $\{1, 2, 3\}$ and $X_2$ is a Bern($p$)
random variable with $p \le 1/2$. Random variables $X_1$, $X_2$, and $Y$ are supposed to be independent. Define the binary
function $f(X_1, Y)$ to be equal to 1 whenever $X_1 = Y$ and equal to 0 otherwise. The goal is to reconstruct $f(X_1, Y)$
with average Hamming distance equal to zero (i.e., $D_1 = 0$) and $X_2$ with average Hamming distance $D_2 < p$.
For any value of $R_0$, the achievable scheme [11, Theorem 5.1] gives

$$R_X + R_Y \ge H(X_1) + H_b(p) - H_b(d). \tag{14}$$

To see this note that the achievable scheme that yields [11, Theorem 5.1] is such that whatever transmitter-X sends to
transmitter-Y will be retransmitted to the receiver. Therefore, the sum rate is at least as large as in the point-to-point
rate distortion problem where the transmitter has access to X and the receiver, who has access to Y, wants to recover
$f(X_1, Y)$ and $X_2$ with distortions 0 and $D_2$, respectively. For the point-to-point case, due to the independence of
$(X_1, Y)$ and $X_2$, the infimum of sum rate is at least

$$R_0(f(X_1, Y)) + R_2.$$

Here $R_0(f(X_1, Y))$ is the infimum of number of bits for recovering $f(X_1, Y)$ with zero distortion, which is equal
to $H(X_1)$ due to [16, Theorem 2], and $R_2$ is the infimum of number of bits for recovering $X_2$ with distortion
$D_2 < p$ and is equal to $H_b(p) - H_b(d)$ by [3, Theorem 10.3.1]. Inequality (14) then follows.
Now, for the scheme [11, Theorem 5.4] the infimum of sum rate for $R_0 \ge H(X|Y)$ is

$$R_X + R_Y = H(f(X_1, Y)) + H_b(p) - H_b(d). \tag{15}$$

Therefore, from (14) and (15) the time sharing of [11, Theorem 5.1] and [11, Theorem 5.4] gives

$$R_X + R_Y \ge q \cdot H(f(X_1, Y)) + (1 - q) \cdot H(X_1) + H_b(p) - H_b(d) \tag{16}$$

for $q \in [0, 1]$. To have an average cooperation at most equal to $R_0$, the time-sharing constant q should be less than
$R_0 / H(X|Y)$, because the scheme [11, Theorem 5.4] needs, on average, more than $H(X|Y)$ cooperation bits.

Now, since $H(f(X_1, Y)) < H(X_1)$, the bigger q is, the smaller the right-hand side of (16), and therefore

$$R_X + R_Y \ge \frac{R_0}{H(X|Y)} \cdot H(f(X_1, Y)) + \left(1 - \frac{R_0}{H(X|Y)}\right) \cdot H(X_1) + H_b(p) - H_b(d). \tag{17}$$
We now turn to Theorem 7. By letting $U = X_1$, $T$ be a constant, $W = f(X_1, Y)$, and $V = \mathrm{Bern}\left(\frac{p-d}{1-2d}\right)$ with⁷

$$p_{V|X,Y}(v|x, y) = p_{V|X_2}(v|x_2) = \frac{p_V(v)\, p_Z(v \oplus x_2)}{p_{X_2}(x_2)},$$

where $Z = \mathrm{Bern}(d)$, Theorem 7 gives for $R_0 \ge H(X_1|Y)$ the sum rate

$$R_X + R_Y = H(f(X_1, Y)) + H_b(p) - H_b(d), \tag{18}$$

which can be checked to be strictly below the right-hand side of (17) for $H(X_1|Y) \le R_0 < H(X|Y)$.
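The comparison between (17) and (18) is easy to evaluate numerically. The sketch below is illustrative only: the function names and the parameter values p = 0.5, d = 0.1, $R_0 = 2$ are assumptions chosen so that $H(X_1|Y) \le R_0 < H(X|Y)$, and the time-sharing bound is taken to be (16) evaluated at $q = R_0 / H(X|Y)$.

```python
from math import log2

def hb(q):
    """Binary entropy function in bits."""
    return 0.0 if q in (0.0, 1.0) else -q * log2(q) - (1 - q) * log2(1 - q)

def example3_sum_rates(p, d, r0):
    """Return (time-sharing bound (17), Theorem 7 sum rate (18)) for Example 3."""
    h_x1 = log2(3)               # X_1 is uniform over {1, 2, 3}
    h_f = hb(1 / 3)              # f(X_1, Y) = 1 with probability 1/3
    h_x_given_y = h_x1 + hb(p)   # X = (X_1, X_2) is independent of Y
    q = min(r0 / h_x_given_y, 1.0)
    time_sharing = q * h_f + (1 - q) * h_x1 + hb(p) - hb(d)
    theorem7 = h_f + hb(p) - hb(d)
    return time_sharing, theorem7

# Hypothetical operating point with log2(3) <= R_0 < log2(3) + H_b(p).
ts, t7 = example3_sum_rates(p=0.5, d=0.1, r0=2.0)
```

At this operating point the time-sharing bound is about 1.60 bits while the Theorem 7 sum rate is about 1.45 bits, so the gap claimed in the example is visible.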

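Example 4. Let X and Y be random variables taking values in $\{-1, 0, +1\}$ with probabilities

$$p(x, y) = \begin{cases} 0 & \text{if } (x, y) = (-1, +1) \text{ or } (x, y) = (+1, -1), \\ 1/7 & \text{otherwise.} \end{cases}$$

Define the distortion function

$$d_1(x, \hat{x}) = \begin{cases} 1 & \text{if } x \cdot \mathrm{sign}(\hat{x}) = -1, \\ 0 & \text{otherwise,} \end{cases}$$

where

$$\mathrm{sign}(\hat{x}) = \begin{cases} +1 & \text{if } \hat{x} > 0, \\ -1 & \text{if } \hat{x} < 0, \\ 0 & \text{if } \hat{x} = 0. \end{cases}$$

Define $d_2(y, \hat{y})$ as $d_1$. We consider the rate region for distortion pairs $(D_1, D_2) = (0, 0)$.

⁷ We use $\oplus$ to denote the sum modulo 2.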



We claim that

1. for any value of $R_0$ in [11, Theorem 5.1]

$$R_X + R_Y > 1.03; \tag{19}$$

2. the infimum of sum rate in [11, Theorem 5.4] under full cooperation $R_0 \ge H(X|Y) = 1.25$ is

$$R_X + R_Y = 0.85; \tag{20}$$

3. from Theorem 7 it is possible to achieve, for any $R_0 > 0.38$, the sum rate

$$R_X + R_Y = 0.85. \tag{21}$$

From 1. and 2. it can be concluded that any time sharing of the schemes [11, Theorem 5.1] and [11, Theorem
5.4] that achieves $R_0 = 0.39$ yields a sum rate bigger than 0.89, which is larger than the sum rate achieved by
Theorem 7.

The proofs of Claims 1.-3. are deferred to the Appendix.
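Two of the numbers in Example 4 can be sanity-checked with a short computation. The sketch below assumes that $p(x, y)$ vanishes on the two pairs listed in the example and equals 1/7 on the remaining seven pairs; with that assumption it recovers $H(X|Y) \approx 1.25$. The time-sharing estimate at the end is a rough check rather than the argument of the Appendix: it uses a fraction q of the full cooperation scheme, with $q \cdot 1.25 \le 0.39$, and the values in (19) and (20).

```python
from math import log2

values = (-1, 0, 1)
excluded = {(-1, 1), (1, -1)}  # pairs with zero probability (assumption, see lead-in)
pmf = {(x, y): (0.0 if (x, y) in excluded else 1 / 7)
       for x in values for y in values}

# H(X|Y) = -sum_{x,y} p(x, y) log2 p(x | y)
p_y = {y: sum(pmf[(x, y)] for x in values) for y in values}
h_x_given_y = -sum(p * log2(p / p_y[y]) for (x, y), p in pmf.items() if p > 0)

# Rough time-sharing estimate between [11, Theorem 5.1] and [11, Theorem 5.4]
# with an average cooperation rate of 0.39 bits.
q_max = 0.39 / 1.25
time_sharing_lower_bound = q_max * 0.85 + (1 - q_max) * 1.03
```

Under these assumptions the time-sharing estimate stays above the value 0.89 quoted in the example and therefore above the sum rate 0.85 of Theorem 7.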



IV. Analysis

Proof of Theorem 1: Pick T, U, V, and W as in the theorem. These random variables together with $(X, Y)$
are distributed according to some distribution $p(v, x, t, u, y, w)$.

The coding procedure consists of two phases. In the first phase transmitter-X sends $(T(X), U(T(X)))$ to
transmitter-Y. In the second phase, both transmitters send $T(X)$ to the receiver. In addition to this message,
transmitter-X and transmitter-Y send $V(X, T(X))$ and $W(Y, T(X), U(T(X)))$, respectively, to the receiver. As
can be seen, only part of the message sent from transmitter-X to transmitter-Y, $T(X)$, is retransmitted from both
transmitters to the receiver, while for the other part, $U(T(X))$, a function of it, $W(Y, T(X), U(T(X)))$, is sent by
transmitter-Y to the receiver. Details follow.

For $t \in \mathcal{T}$, $v \in \Gamma(G_{T,X|T,U,Y})$, and $w \in \Gamma(G_{T,U,Y|T,V})$, define $f(v, t, w)$ to be equal to $f(x, y)$ for all $(t, x) \in v$
and $(t, u, y) \in w$ such that $p(x, t, u, y) > 0$. Further, for $\mathbf{t} = (t_1, \ldots, t_n)$, $\mathbf{v} = (v_1, \ldots, v_n)$, and $\mathbf{w} = (w_1, \ldots, w_n)$
let

$$f(\mathbf{v}, \mathbf{t}, \mathbf{w}) = f(v_1, t_1, w_1), \ldots, f(v_n, t_n, w_n).$$



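Generate $2^{nI(T;X)}$ sequences

$$\mathbf{t}^{(i)} = (t_1^{(i)}, t_2^{(i)}, \ldots, t_n^{(i)}),$$

$i \in \{1, 2, \ldots, 2^{nI(T;X)}\}$, i.i.d. according to the marginal distribution $p(t)$.
For each codeword $\mathbf{t}^{(i)}$ generate $2^{nI(U;X|T)}$ sequences

$$\mathbf{u}^{(j)}(\mathbf{t}^{(i)}) = (u_1^{(j)}(t_1^{(i)}), u_2^{(j)}(t_2^{(i)}), \ldots, u_n^{(j)}(t_n^{(i)})),$$

$j \in \{1, 2, \ldots, 2^{nI(U;X|T)}\}$, i.i.d. according to the marginal distribution $p(u|t)$, and randomly bin each sequence
$(\mathbf{t}^{(i)}, \mathbf{u}^{(j)}(\mathbf{t}^{(i)}))$ uniformly into $2^{nR_0}$ bins. Similarly, generate $2^{nI(V;X|T)}$ and $2^{nI(U,Y;W|T)}$ sequences

$$\mathbf{v}^{(k)}(\mathbf{t}^{(i)}) = (v_1^{(k)}(t_1^{(i)}), v_2^{(k)}(t_2^{(i)}), \ldots, v_n^{(k)}(t_n^{(i)})),$$

and

$$\mathbf{w}^{(l)}(\mathbf{t}^{(i)}) = (w_1^{(l)}(t_1^{(i)}), w_2^{(l)}(t_2^{(i)}), \ldots, w_n^{(l)}(t_n^{(i)})),$$

respectively, i.i.d. according to $p(v|t)$ and $p(w|t)$, respectively, and randomly and uniformly bin each sequence
$(\mathbf{t}^{(i)}, \mathbf{v}^{(k)}(\mathbf{t}^{(i)}))$ and $(\mathbf{t}^{(i)}, \mathbf{w}^{(l)}(\mathbf{t}^{(i)}))$ into $2^{nR_X}$ and $2^{nR_Y}$ bins, respectively. Reveal the bin assignment $\phi_0$ to both
the encoders and the bin assignments $\phi_X$ and $\phi_Y$ to the encoders and the decoder.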
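Encoding

First phase: Transmitter-X tries to find a sequence $(\mathbf{t}, \mathbf{u}(\mathbf{t}))$ that is jointly typical with $\mathbf{x}$, i.e.,⁸
$(\mathbf{t}, \mathbf{u}(\mathbf{t}), \mathbf{x}) \in A_{\varepsilon''}^{(n)}(T, U, X)$, and sends the index of the bin that contains this sequence, i.e., $\phi_0(\mathbf{t}, \mathbf{u}(\mathbf{t})) = q_0$, to transmitter-Y.

Second phase: Transmitter-X tries to find a unique $\mathbf{v}(\mathbf{t})$ that is jointly typical with $(\mathbf{x}, \mathbf{t})$, i.e.,
$(\mathbf{v}(\mathbf{t}), \mathbf{x}, \mathbf{t}) \in A_{\varepsilon'}^{(n)}(V, X, T)$, and sends the index of the bin that contains $(\mathbf{t}, \mathbf{v}(\mathbf{t}))$, i.e., $\phi_X(\mathbf{t}, \mathbf{v}(\mathbf{t})) = q_X$, to the receiver.

Transmitter-Y, upon receiving the index $q_0$, first tries to find a unique $(\hat{\mathbf{t}}, \hat{\mathbf{u}}(\hat{\mathbf{t}}))$ such that $(\hat{\mathbf{t}}, \hat{\mathbf{u}}(\hat{\mathbf{t}}), \mathbf{y}) \in A_{\varepsilon'}^{(n)}(T, U, Y)$
and such that $\phi_0(\hat{\mathbf{t}}, \hat{\mathbf{u}}(\hat{\mathbf{t}})) = q_0$. Then, it tries to find a unique $\mathbf{w}(\hat{\mathbf{t}})$ that is jointly typical with $(\hat{\mathbf{u}}(\hat{\mathbf{t}}), \mathbf{y})$, i.e.,
$(\mathbf{w}(\hat{\mathbf{t}}), \hat{\mathbf{u}}(\hat{\mathbf{t}}), \mathbf{y}) \in A_{\varepsilon'}^{(n)}(W, U, Y)$, and sends the index of the bin that contains $(\hat{\mathbf{t}}, \mathbf{w}(\hat{\mathbf{t}}))$, i.e., $q_Y = \phi_Y(\hat{\mathbf{t}}, \mathbf{w}(\hat{\mathbf{t}}))$, to
the receiver.

If a transmitter cannot find an index as above, it declares an error, and if there is more than one index, the
transmitter selects one of them randomly and uniformly.

Decoding: Given the index pair $(q_X, q_Y)$, declare $f(\mathbf{v}(\mathbf{t}), \mathbf{t}, \mathbf{w}(\mathbf{t}))$ if there exists a unique jointly typical $(\mathbf{v}, \mathbf{t}, \mathbf{w}) \in
A_{\varepsilon}^{(n)}(V, T, W)$ such that $\phi_X(\mathbf{t}, \mathbf{v}(\mathbf{t})) = q_X$ and $\phi_Y(\mathbf{t}, \mathbf{w}(\mathbf{t})) = q_Y$, and such that $f(\mathbf{v}(\mathbf{t}), \mathbf{t}, \mathbf{w}(\mathbf{t}))$ is defined.
Otherwise declare an error.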

Probability of Error: In each of the two phases there are two types of error.

In the first phase, the first type of error occurs when no $(\mathbf{t}, \mathbf{u}(\mathbf{t}))$ is jointly typical with $\mathbf{x}$. The probability of
this error is negligible for n large enough, due to the covering lemma (Lemma 5 in the second Appendix).

The second type of error occurs if $(\hat{\mathbf{t}}, \hat{\mathbf{u}}(\hat{\mathbf{t}})) \ne (\mathbf{t}, \mathbf{u}(\mathbf{t}))$. By symmetry of the scheme, this error probability is
the same as the average error probability conditioned on transmitter-X selecting $\mathbf{T}^{(1)}$ and $\mathbf{U}^{(1)}(\mathbf{T}^{(1)})$. So, we
consider the error event

$$\mathcal{E}' = \{(\hat{\mathbf{T}}, \hat{\mathbf{U}}(\hat{\mathbf{T}})) \ne (\mathbf{T}^{(1)}, \mathbf{U}^{(1)}(\mathbf{T}^{(1)}))\}. \tag{22}$$

⁸ $A_{\varepsilon}^{(n)}(X, Y)$ is the set of jointly $\varepsilon$-typical n-sequences. See the second Appendix for more details.



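Define the following events

$$\mathcal{E}'_{i,j} = \{(\mathbf{T}^{(i)}, \mathbf{U}^{(j)}(\mathbf{T}^{(i)}), \mathbf{Y}) \in A_{\varepsilon'}^{(n)}(T, U, Y),\ \phi_0(\mathbf{T}^{(i)}, \mathbf{U}^{(j)}(\mathbf{T}^{(i)})) = q_0\}.$$

Hence we have

$$P(\mathcal{E}') = P\Big(\mathcal{E}'^{c}_{1,1} \cup \big(\cup_{j \ne 1} \mathcal{E}'_{1,j}\big) \cup \big(\cup_{i \ne 1} \mathcal{E}'_{i,1}\big) \cup \big(\cup_{i \ne 1, j \ne 1} \mathcal{E}'_{i,j}\big)\Big)$$
$$\le P(\mathcal{E}'^{c}_{1,1}) + \sum_{j \ne 1} P(\mathcal{E}'_{1,j}) + \sum_{i \ne 1} P(\mathcal{E}'_{i,1}) + \sum_{i \ne 1, j \ne 1} P(\mathcal{E}'_{i,j}). \tag{23}$$

According to the properties of jointly typical sequences (Lemmas 1, 2, 3, and 4), for any $\varepsilon' > \varepsilon'' > 0$ we have

• $P(\mathcal{E}'^{c}_{1,1}) \le \delta'(\varepsilon', \varepsilon'')$ due to the encoding process and the Markov chain $T - U - X - Y$;

• for $j \ne 1$,

$$P(\mathcal{E}'_{1,j}) \le 2^{-n(I(U;Y|T) - \delta'_1(\varepsilon'))} \cdot 2^{-nR_0};$$

• for $i \ne 1$,

$$P(\mathcal{E}'_{i,1}) \le 2^{-n(I(T,U;Y) - \delta'_2(\varepsilon'))} \cdot 2^{-nR_0};$$

• for $i \ne 1$ and $j \ne 1$,

$$P(\mathcal{E}'_{i,j}) \le 2^{-n(I(T,U;Y) - \delta'_3(\varepsilon'))} \cdot 2^{-nR_0};$$

where $\delta'(\varepsilon', \varepsilon'')$, $\delta'_1(\varepsilon')$, $\delta'_2(\varepsilon')$, and $\delta'_3(\varepsilon')$ tend to zero as $\varepsilon'$ tends to zero.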
Using the above bounds, the probability of error in (23) can be bounded as

$$\begin{aligned} P(\mathcal{E}') \le \delta'(\varepsilon', \varepsilon'') &+ 2^{nI(U;X|T)} \cdot 2^{-n(I(U;Y|T) - \delta'_1(\varepsilon'))} \cdot 2^{-nR_0} \\ &+ 2^{nI(T;X)} \cdot 2^{-n(I(T,U;Y) - \delta'_2(\varepsilon'))} \cdot 2^{-nR_0} \\ &+ 2^{nI(U,T;X)} \cdot 2^{-n(I(T,U;Y) - \delta'_3(\varepsilon'))} \cdot 2^{-nR_0}. \end{aligned}$$

Hence the error probability can be made to vanish as n tends to infinity as long as

$$R_0 > I(X; U, T) - I(U, T; Y) = I(X; U, T|Y) = I(X; U|Y), \tag{24}$$

where the equalities are due to the Markov chain $T - U - X - Y$.
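For completeness, the two equalities in (24) can be spelled out step by step. The derivation below is a sketch that uses only the chain rule for mutual information and the Markov chain $T - U - X - Y$; the intermediate steps are not stated in the text and are supplied here as a plausible expansion.

```latex
\begin{align*}
I(X; U, T) - I(U, T; Y)
  &= I(X, Y; U, T) - I(U, T; Y)
     && \text{since } I(Y; U, T \mid X) = 0 \text{ by } (T, U) - X - Y \\
  &= I(X; U, T \mid Y)
     && \text{chain rule} \\
  &= I(X; U \mid Y) + I(X; T \mid U, Y)
     && \text{chain rule} \\
  &= I(X; U \mid Y)
     && \text{since } I(X; T \mid U, Y) = 0 \text{ by } T - U - (X, Y).
\end{align*}
```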

In the second phase, the first type of error occurs when no $\mathbf{v}(\mathbf{t})$, respectively no $\mathbf{w}(\mathbf{t})$, is jointly typical with
$(\mathbf{x}, \mathbf{t})$, respectively with $(\mathbf{u}(\mathbf{t}), \mathbf{y})$. The probability of each of these two errors is negligible for n large enough,
due to the covering lemma (Lemma 5 in the second Appendix). Hence, the probability of the first type of error is
negligible.

The second type of error refers to the Slepian-Wolf coding procedure. By symmetry of the scheme, the average
error probability of the Slepian-Wolf coding procedure is the same as the average error probability conditioned on
the transmitters selecting $\mathbf{T}^{(1)}$, $\mathbf{U}^{(1)}(\mathbf{T}^{(1)})$, $\mathbf{V}^{(1)}(\mathbf{T}^{(1)})$, and $\mathbf{W}^{(1)}(\mathbf{T}^{(1)})$. Note that if the transmitted messages are
decoded correctly at the decoder, then there is no error due to Claim 3.b. of Lemma 1 and the definitions of
T, V, W, and $f(V, T, W)$.
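We now consider the error event

$$\mathcal{E} = \{(\hat{\mathbf{T}}, \hat{\mathbf{V}}(\hat{\mathbf{T}}), \hat{\mathbf{W}}(\hat{\mathbf{T}})) \ne (\mathbf{T}^{(1)}, \mathbf{V}^{(1)}(\mathbf{T}^{(1)}), \mathbf{W}^{(1)}(\mathbf{T}^{(1)}))\}, \tag{25}$$

and assume that the transmitters selected $\mathbf{T}^{(1)}$, $\mathbf{U}^{(1)}(\mathbf{T}^{(1)})$, $\mathbf{V}^{(1)}(\mathbf{T}^{(1)})$, and $\mathbf{W}^{(1)}(\mathbf{T}^{(1)})$.
Define the following events,

$$\mathcal{E}_{i,k,l} = \{(\mathbf{T}^{(i)}, \mathbf{V}^{(k)}(\mathbf{T}^{(i)}), \mathbf{W}^{(l)}(\mathbf{T}^{(i)})) \in A_{\varepsilon}^{(n)}(T, V, W),\ (\phi_X(\mathbf{T}^{(i)}, \mathbf{V}^{(k)}(\mathbf{T}^{(i)})), \phi_Y(\mathbf{T}^{(i)}, \mathbf{W}^{(l)}(\mathbf{T}^{(i)}))) = (q_X, q_Y)\}.$$

We have

$$P(\mathcal{E}) = P\Big(\mathcal{E}^{c}_{1,1,1} \cup \big(\cup_{k \ne 1} \mathcal{E}_{1,k,1}\big) \cup \big(\cup_{l \ne 1} \mathcal{E}_{1,1,l}\big) \cup \big(\cup_{k \ne 1, l \ne 1} \mathcal{E}_{1,k,l}\big) \cup \big(\cup_{i \ne 1, k, l} \mathcal{E}_{i,k,l}\big)\Big)$$
$$\le P(\mathcal{E}^{c}_{1,1,1}) + \sum_{k \ne 1} P(\mathcal{E}_{1,k,1}) + \sum_{l \ne 1} P(\mathcal{E}_{1,1,l}) + \sum_{k \ne 1, l \ne 1} P(\mathcal{E}_{1,k,l}) + \sum_{i \ne 1, k, l} P(\mathcal{E}_{i,k,l}). \tag{26}$$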

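According to the properties of jointly typical sequences (Lemmas 1, 2, 3, and 4), for any $\varepsilon > \varepsilon' > 0$ we have

• $P(\mathcal{E}^{c}_{1,1,1}) \le \delta(\varepsilon, \varepsilon')$ due to the encoding process and the Markov chain $V - (T, X) - (U, Y) - W$;

• for $k \ne 1$,

$$P(\mathcal{E}_{1,k,1}) \le 2^{-n(I(V;W|T) - \delta_1(\varepsilon))} \cdot 2^{-nR_X};$$

• for $l \ne 1$,

$$P(\mathcal{E}_{1,1,l}) \le 2^{-n(I(V;W|T) - \delta_2(\varepsilon))} \cdot 2^{-nR_Y};$$

• for $k \ne 1$, $l \ne 1$,

$$P(\mathcal{E}_{1,k,l}) \le 2^{-n(I(V;W|T) - \delta_3(\varepsilon))} \cdot 2^{-nR_X} \cdot 2^{-nR_Y};$$

• for $i \ne 1$,

$$P(\mathcal{E}_{i,k,l}) \le 2^{-n(I(V;W|T) - \delta_4(\varepsilon))} \cdot 2^{-nR_X} \cdot 2^{-nR_Y};$$

where $\delta(\varepsilon, \varepsilon')$, $\delta_1(\varepsilon)$, $\delta_2(\varepsilon)$, $\delta_3(\varepsilon)$, and $\delta_4(\varepsilon)$ tend to zero as $\varepsilon$ tends to zero.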



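Using the above bounds, the probability of error in (26) can be bounded as

$$\begin{aligned} P(\mathcal{E}) \le \delta(\varepsilon, \varepsilon') &+ 2^{nI(V;X|T)} \cdot 2^{-n(I(V;W|T) - \delta_1(\varepsilon))} \cdot 2^{-nR_X} \\ &+ 2^{nI(U,Y;W|T)} \cdot 2^{-n(I(V;W|T) - \delta_2(\varepsilon))} \cdot 2^{-nR_Y} \\ &+ 2^{nI(V;X|T)} \cdot 2^{nI(U,Y;W|T)} \cdot 2^{-n(I(V;W|T) - \delta_3(\varepsilon))} \cdot 2^{-nR_X} \cdot 2^{-nR_Y} \\ &+ 2^{nI(X;T)} \cdot 2^{nI(V;X|T)} \cdot 2^{nI(U,Y;W|T)} \cdot 2^{-n(I(V;W|T) - \delta_4(\varepsilon))} \cdot 2^{-nR_X} \cdot 2^{-nR_Y}. \end{aligned}$$

Hence the error probability goes to zero as n goes to infinity whenever inequalities (1) are satisfied.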

We now prove that for calculating the rate region it is sufficient to consider random variables with cardinality
bounds (4). Suppose $(V, X, T, U, Y, W) \sim p(v, x, t, u, y, w)$ satisfies (2) and (3).
Cardinality of $T$: We want to bound the cardinality of $\mathcal{T}$ by $|\mathcal{X}| + 4$. Suppose that $|\mathcal{T}| \geq |\mathcal{X}| + 5$.

We keep $p(v, x, u, y, w \mid t)$ unchanged. The Markov chains (2) imply that $p(y, x \mid u)$ and $p(w \mid y, u, x)$ also remain
unchanged. This guarantees that the Markov chains (2) hold for any new probability distribution $p'(t)$, $t \in \mathcal{T}$.
Then we assign a new probability distribution $p'(t)$, $t \in \mathcal{T}$, such that $p'(t) = 0$ for at least $|\mathcal{T}| - (|\mathcal{X}| + 4)$ elements, and
remove these elements from $\mathcal{T}$. The cardinality is now at most $|\mathcal{X}| + 4$. We choose this new probability distribution
in a way that $p(x, y)$ and the right-hand sides of (1) remain unchanged. This guarantees that the achievable rate
region remains unchanged. Note that since the support sets of the new random variables are subsets of the previous
support sets, the conditions (3) are satisfied. We now show how to choose a new probability distribution with the
desired characteristics.

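The new probability distribution $p'(t)$ should satisfy

$$\sum_{t \in \mathcal{T}} p'(t) = 1. \qquad (27)$$

To keep $p(x, y)$ unchanged we keep $p(x)$ and $p(y \mid x)$. For this, $p'(t)$ should satisfy

$$\sum_{t \in \mathcal{T}} p(x = i \mid t) \cdot p'(t) = p(x = i)\big|_{p(t)}, \qquad 1 \leq i \leq |\mathcal{X}| - 1, \qquad (28)$$

where $p(x = i)\big|_{p(t)}$ is the original distribution of $X$.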
Consider the right-hand side of the first term in (1):

$$I(X; U \mid Y)\big|_{p(t)} = H(X \mid Y)\big|_{p(t)} - H(X \mid U, Y)\big|_{p(t)}.$$

If $p(x, y)$ remains unchanged, so does $H(X \mid Y)$. Hence, to keep the value of $I(X; U \mid Y)\big|_{p(t)}$, we should
keep the value of $H(X \mid U, Y)\big|_{p(t)}$, i.e., we should have

$$\sum_{t \in \mathcal{T}} a_t \, p'(t) = b, \qquad (29)$$

where $a_t = \sum_{u, y} H(X \mid U = u, Y = y) \, p(y \mid u) \, p(u \mid t)$ and $b = H(X \mid U, Y)\big|_{p(t)}$. Note that the $a_t$'s do not depend on
$p'(t)$.



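Similarly, to keep the right-hand sides of the other terms in (1), $p'(t)$ should satisfy the set of linear equations

$$\sum_{t \in \mathcal{T}} I(V; X \mid W, T = t) \cdot p'(t) = I(V; X \mid T, W)\big|_{p(t)}, \qquad (30)$$

$$\sum_{t \in \mathcal{T}} I(U, Y; W \mid V, T = t) \cdot p'(t) = I(U, Y; W \mid V, T)\big|_{p(t)}, \qquad (31)$$

$$\sum_{t \in \mathcal{T}} \big(H(X, Y \mid V, T = t, W) + I(U; W \mid V, X, T = t, W)\big) \cdot p'(t) = \big(H(X, Y \mid V, T, W) + I(U; W \mid V, X, T, W)\big)\big|_{p(t)}. \qquad (32)$$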

Combining, we deduce that the distribution $p'(t)$ should satisfy the set of $m = |\mathcal{X}| + 4$ linear equations (27)-(32).
We write these equations in the matrix form

$$A_{m \times n} \times Z_{n \times 1} = B_{m \times 1}, \qquad (33)$$

where $n = |\mathcal{T}|$, where $Z$ denotes the vector of $p'(t)$, $t \in \mathcal{T}$, where $A$ denotes the matrix of coefficients (constants
on the left-hand side of (27)-(32)), and $B$ the vector of constants on the right-hand side of equations (27)-(32).

We want to find a non-negative solution $Z$ of the above equation where $Z_i = 0$ for at least $n - m$ indices $1 \leq i \leq n$.
We find such a solution recursively, i.e., we show that if $n > m$, then we can find a solution $S$ which has at least
one zero entry, say $S_i$. Then, we set $n := n - 1$, remove the corresponding column of $A$ and corresponding row
of $Z$ and repeat the procedure.

We now show that if $n > m$, then there exists a non-negative solution for (33) with at least one zero entry.
We know that (33) has at least one non-negative solution, which is the vector $p(t)$. Therefore, if we find another
solution with at least one negative entry then, since the solution space of (33) is convex, there exists a non-negative solution
with at least one zero entry.

Since $n > m$, there exists a column which is a linear combination of the other columns. Without loss of generality,
suppose $A_n = \sum_{i=1}^{n-1} a_i A_i$, where $A_i$ is the $i$-th column of $A$. Now, if $Z = [Z_1, \ldots, Z_n]^T$ is a non-negative solution,
then

$$Z' = [Z_1 + c \cdot a_1, Z_2 + c \cdot a_2, \ldots, Z_{n-1} + c \cdot a_{n-1}, Z_n - c]^T$$

is also a solution for any value of $c$. By a suitable choice of $c$, $Z_n - c$ is negative, which completes the proof.
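The recursive support-reduction argument above can be illustrated numerically. The following is a minimal sketch, not the authors' procedure verbatim: it assumes NumPy, uses a null-space direction of the coefficient matrix (obtained from an SVD) in place of the explicit column combination in the text, and the function name `reduce_support` is hypothetical.

```python
import numpy as np

def reduce_support(A, z, tol=1e-12):
    """Given z >= 0 with A @ z = b, return z' >= 0 with A @ z' = b
    and at most m = A.shape[0] nonzero entries (illustrative sketch)."""
    A = np.asarray(A, dtype=float)
    z = np.asarray(z, dtype=float).copy()
    m = A.shape[0]
    while True:
        z[z <= tol] = 0.0                      # treat tiny entries as zero
        support = np.flatnonzero(z)
        if support.size <= m:
            return z
        # More columns than rows: the last right singular vector lies in the
        # null space of the restricted matrix, so moving along it keeps A @ z.
        _, _, vt = np.linalg.svd(A[:, support])
        d = vt[-1]
        if d.max() <= 0:
            d = -d                             # make sure some entry is positive
        pos = d > 0
        ratios = z[support][pos] / d[pos]
        j = np.argmin(ratios)
        c = ratios[j]                          # largest step keeping z non-negative
        z[support] = np.maximum(z[support] - c * d, 0.0)
        z[support[pos][j]] = 0.0               # this entry hits zero exactly
```

Each pass removes at least one element from the support while leaving every linear constraint unchanged, which mirrors setting $n := n - 1$ in the argument above.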
Cardinalities of $V$ and $W$: To bound the cardinalities of $\mathcal{V}$ and $\mathcal{W}$ by $|\mathcal{T}| \times |\mathcal{X}| + 1$ and $|\mathcal{U}| \times |\mathcal{Y}| + 1$, respectively,
one proceeds as in [17, Proof of Theorem 1] by means of Carathéodory's theorem. ■
Proof of Theorem 2: For achievability it suffices to let $T = U$ and $V = X$ in Theorem 1.
Now for the converse. Let $C_0 = \varphi_0(\mathbf{X})$ be the message received by transmitter-Y and let $C_X = \varphi_X(\mathbf{X})$ and
$C_Y = \varphi_Y(C_0, \mathbf{Y})$ be the received messages at the receiver from transmitter-X and transmitter-Y, respectively.
Suppose that

$$P(\psi(C_X, C_Y) \neq f(\mathbf{X}, \mathbf{Y})) \leq \delta_n,$$

where $\delta_n \to 0$ when $n \to \infty$. Using Fano's inequality we have

$$H(f(\mathbf{X}, \mathbf{Y}) \mid C_X, C_Y) \leq \epsilon_n,$$

where $\epsilon_n \to 0$ when $n \to \infty$.

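We start by showing that the Markov chain

$$f(X_i, Y_i) - (C_0, X_i^n, Y_1^{i-1}, C_Y) - C_X \qquad (34)$$

holds. We have

$$\begin{aligned}
p(f(x_i, y_i) \mid c_0, x_i^n, y_1^{i-1}, c_Y, c_X)
&= \sum_{x_1^{i-1}} p(f(x_i, y_i) \mid c_0, x_1^n, y_1^{i-1}, c_Y, c_X) \cdot p(x_1^{i-1} \mid c_0, x_i^n, y_1^{i-1}, c_Y, c_X) \\
&\overset{(a)}{=} \sum_{x_1^{i-1}} p(f(x_i, y_i) \mid c_0, x_1^n, y_1^{i-1}, c_Y) \cdot p(x_1^{i-1} \mid c_0, x_i^n, y_1^{i-1}, c_Y, c_X) \\
&= \sum_{x_1^{i-1}} p(x_1^{i-1} \mid c_0, x_i^n, y_1^{i-1}, c_Y, c_X) \cdot \sum_{y_i} p(f(x_i, y_i) \mid x_i, y_i) \cdot p(y_i \mid c_0, x_1^n, y_1^{i-1}, c_Y) \\
&= \sum_{x_1^{i-1}} p(x_1^{i-1} \mid c_0, x_i^n, y_1^{i-1}, c_Y, c_X) \cdot \sum_{y_i} p(f(x_i, y_i) \mid x_i, y_i) \cdot p(y_i \mid c_0, x_i^n, y_1^{i-1}, c_Y) \\
&= \sum_{x_1^{i-1}} p(f(x_i, y_i) \mid c_0, x_i^n, y_1^{i-1}, c_Y) \cdot p(x_1^{i-1} \mid c_0, x_i^n, y_1^{i-1}, c_Y, c_X) \\
&= p(f(x_i, y_i) \mid c_0, x_i^n, y_1^{i-1}, c_Y),
\end{aligned}$$

where (a) is due to the fact that $C_X$ is a function of $X_1^n$. This gives the desired Markov chain.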

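Now, by taking $U_i = \{C_0, X_{i+1}^n, Y_1^{i-1}\}$ and $W_i = \{C_Y, Y_1^{i-1}\}$, the Markov chains $U_i - X_i - Y_i$ and $X_i - \{U_i, Y_i\} - W_i$
hold and

$$\begin{aligned}
H(f(X_i, Y_i) \mid X_i, U_i, W_i) &\overset{(a)}{=} H(f(X_i, Y_i) \mid X_i, U_i, W_i, C_X) \\
&\leq H(f(X_i, Y_i) \mid C_Y, C_X) \\
&\leq H(f(\mathbf{X}, \mathbf{Y}) \mid C_Y, C_X) \\
&\leq \epsilon_n,
\end{aligned}$$

where (a) is true due to the Markov chain (34) and the last step follows from Fano's inequality.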



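Then, we have

$$\begin{aligned}
nR_0 &\geq \log_2 |C_0| \\
&\geq H(C_0) \\
&\geq I(C_0; X_1^n \mid Y_1^n) \\
&\geq \sum_{i=1}^{n} \big[H(X_i \mid Y_i) - H(X_i \mid Y_1^{i-1}, C_0, X_{i+1}^n, Y_i)\big] \\
&= \sum_{i=1}^{n} I(X_i; U_i \mid Y_i) \qquad (35)
\end{aligned}$$

and

$$\begin{aligned}
nR_X &\geq \log_2 |C_X| \\
&\geq H(C_X) \\
&\geq I(C_X; X_1^n \mid C_0, C_Y) \\
&\overset{(a)}{\geq} H(X_1^n \mid C_0, C_Y) - \epsilon_n \\
&\geq \sum_{i=1}^{n} H(X_i \mid X_{i+1}^n, C_0, C_Y, Y_1^{i-1}) - \epsilon_n \\
&= \sum_{i=1}^{n} H(X_i \mid U_i, W_i) - \epsilon_n, \qquad (36)
\end{aligned}$$

where (a) comes from the fact that $\mathbf{X}$ can be recovered knowing $(C_X, C_Y)$, since the function $f(X, Y)$ is partially
invertible with respect to $X$. Further,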

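$$\begin{aligned}
nR_Y &\geq \log_2 |C_Y| \\
&\geq H(C_Y) \\
&\geq I(C_Y; Y_1^n \mid C_0, X_1^n) \\
&= \sum_{i=1}^{n} \big[H(Y_i \mid X_{i+1}^n, C_0, Y_1^{i-1}, X_i) - H(Y_i \mid X_{i+1}^n, C_0, Y_1^{i-1}, C_Y, X_i)\big] \\
&= \sum_{i=1}^{n} I(Y_i; W_i \mid X_i, U_i), \qquad (37)
\end{aligned}$$

and

$$\begin{aligned}
n(R_X + R_Y) &\geq \log_2 |(C_X, C_Y)| \geq H(C_X, C_Y) \\
&\geq I(C_X, C_Y; X_1^n, Y_1^n) \\
&\overset{(a)}{\geq} H(X_1^n) + H(Y_1^n \mid X_1^n) - H(Y_1^n \mid X_1^n, C_X, C_Y) - \epsilon_n \\
&= H(X_1^n) + H(Y_1^n \mid X_1^n, C_0) - H(Y_1^n \mid X_1^n, C_0, C_X, C_Y) - \epsilon_n \\
&\geq \sum_{i=1}^{n} \big[H(X_i) + H(Y_i \mid X_{i+1}^n, C_0, Y_1^{i-1}, X_i) - H(Y_i \mid Y_1^{i-1}, X_{i+1}^n, C_0, C_Y, X_i)\big] - \epsilon_n \\
&= \sum_{i=1}^{n} \big[H(X_i) + I(Y_i; W_i \mid X_i, U_i)\big] - \epsilon_n, \qquad (38)
\end{aligned}$$

where (a) comes from the fact that $\mathbf{X}$ can be recovered knowing $(C_X, C_Y)$.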
Let $Q$ be a uniform random variable over $\{1, 2, \ldots, n\}$. Let

$$X = X_Q, \qquad Y = Y_Q, \qquad U = (U_Q, Q), \qquad W = (W_Q, Q).$$

Note that knowing $U$ or $W$, one knows $Q$.

In the remaining part of the proof, we first show that U and W satisfy the inequalities of the theorem, the Markov 
chains (5), and the equality 

$$H(f(X, Y) \mid X, U, W) = 0. \qquad (39)$$

Then, based on W, we introduce a new random variable W' such that U and W' satisfy the inequalities of the 
theorem as well as the Markov chains (5) and the relation (6), which completes the proof. 
We start by showing that U and W satisfy the Markov chains 

U-X-Y 

X-{U,Y)-W. (40) 

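For the first Markov chain, we have

$$H(Y \mid X, U) = \sum_{q=1}^{n} \frac{1}{n} H(Y_q \mid X_q, U_q) \overset{(a)}{=} \sum_{q=1}^{n} \frac{1}{n} H(Y_q \mid X_q) = H(Y \mid X),$$

where (a) is due to the Markov chain $U_q - X_q - Y_q$.

For the second Markov chain, we have

$$I(X; W \mid U, Y) = \sum_{q=1}^{n} \frac{1}{n} I(X_q; W_q \mid U_q, Y_q) \overset{(a)}{=} 0,$$

where (a) is due to the Markov chain $X_q - (U_q, Y_q) - W_q$.
Equation (39) holds due to

$$H(f(X, Y) \mid X, U, W) = \sum_{q=1}^{n} \frac{1}{n} H(f(X_q, Y_q) \mid X_q, U_q, W_q) \leq \epsilon_n$$

and the fact that $\epsilon_n$ can be chosen arbitrarily small.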
Finally, to show that $U$ and $W$ satisfy the inequalities of the theorem, consider the following equalities

$$I(X; U \mid Y) = \frac{1}{n} \sum_{q=1}^{n} I(X_q; U_q \mid Y_q)$$

$$H(X \mid U, W) = \frac{1}{n} \sum_{q=1}^{n} H(X_q \mid U_q, W_q)$$

$$I(Y; W \mid X, U) = \frac{1}{n} \sum_{q=1}^{n} I(Y_q; W_q \mid X_q, U_q)$$

$$H(X) + I(Y; W \mid X, U) = \frac{1}{n} \sum_{q=1}^{n} \big[H(X_q) + I(Y_q; W_q \mid X_q, U_q)\big].$$

This, together with (35), (36), (37), and (38), shows that $U$ and $W$ satisfy the inequalities of the theorem.
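The single-letterization step with the time-sharing variable $Q$ can be checked numerically on a toy example. The sketch below is an illustration only: it assumes NumPy, represents each per-letter law as an array `p[x, u, y]`, and assumes (as in the i.i.d. setting of the proof) that all letters share the same $(x, y)$ marginal. The helper names `cond_mutual_info` and `time_share` are hypothetical.

```python
import numpy as np

def cond_mutual_info(p):
    """I(A;B|C) in bits for a joint pmf given as an array p[a, b, c]."""
    p = np.asarray(p, dtype=float)
    pc = p.sum(axis=(0, 1), keepdims=True)     # p(c)
    pac = p.sum(axis=1, keepdims=True)         # p(a, c)
    pbc = p.sum(axis=0, keepdims=True)         # p(b, c)
    num = p * pc
    den = pac * pbc
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(num[mask] / den[mask])))

def time_share(pmfs):
    """Joint pmf of (X, (U_Q, Q), Y) for Q uniform over the given letters.

    The middle axis of the result indexes the pair (q, u)."""
    n = len(pmfs)
    return np.concatenate([np.asarray(p, dtype=float) / n for p in pmfs], axis=1)
```

With per-letter laws built from a common $p(x, y)$, `cond_mutual_info(time_share(pmfs))` should agree with the average of `cond_mutual_info(p)` over the letters, which is the first equality listed above.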

So far we have shown that $U$ and $W$ satisfy the inequalities of the theorem, the Markov chains (5), and
equation (39).

The last step consists in defining a new random variable $W'$ such that $U$ and $W'$ satisfy the inequalities of the
theorem, the Markov chains (5), and equality (6), which completes the proof. To do this we need the following
definition.

Definition 6 (Support set of a random variable). [17, Definition 6] Let $(V, X) \sim p(v, x)$ where $V$ is a random
variable taking values in some countable set $\mathcal{V} = \{v_1, v_2, \ldots\}$. The support set of $X$ with respect to $V$ is the
random variable $S_X(V)$ defined as

$$S_X(v_j) = (j, S_j = \{x : p(v_j, x) > 0\}) \qquad \forall v_j \in \mathcal{V}.$$

Moreover, random variable $S$ is defined as

$$S = S_j \iff V = v_j, \qquad j = 1, 2, \ldots$$

Note that $V$ and $S_X(V)$ are in one-to-one correspondence by definition. In the sequel, with a slight abuse of
notation we write $Z \in S_X(V)$ whenever $Z \in S$ and write $S_X(V) \in \mathcal{A}$ whenever $S \in \mathcal{A}$.
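As a small illustration of Definition 6, the support sets can be computed directly from a joint pmf. This is a sketch under stated assumptions: the pmf is a Python dictionary mapping `(v, x)` to a probability, the dictionary key `v` plays the role of the index $j$ (which keeps the correspondence with $V$ one-to-one), and the function name `support_sets` is hypothetical.

```python
def support_sets(p):
    """Map each value v to the set of x with p(v, x) > 0.

    p is a dict {(v, x): probability}. Values v that have zero total
    probability do not appear in the result."""
    supports = {}
    for (v, x), prob in p.items():
        if prob > 0:
            supports.setdefault(v, set()).add(x)
    return {v: frozenset(xs) for v, xs in supports.items()}
```

Two distinct values of $V$ with the same support remain distinct entries, which reflects the role of the index $j$ in the definition.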



Let $W' = S_{(U,Y)}(W)$. According to Definition 6 and relations (39) and (40), $U$ and $W'$ satisfy

$$X - (U, Y) - W'$$

$$H(f(X, Y) \mid X, U, W') = 0 \qquad (41)$$

and the inequalities of the theorem. To conclude the proof it remains to show that

$$(U, Y) \in W' \in M(\Gamma(G_{U,Y|X,U})).$$

That $(U, Y) \in W'$ follows directly from the fact that $W' = S_{(U,Y)}(W)$. We show that $W' \in M(\Gamma(G_{U,Y|X,U}))$ by
contradiction. Suppose that $w' \in \mathcal{W}'$ is not an independent set in $G_{U,Y|X,U}$. Notice that for any $u_i, u_j \in \mathcal{U}$, with
$u_i \neq u_j$, and $y_i, y_j \in \mathcal{Y}$, $(u_i, y_i)$ and $(u_j, y_j)$ are not connected in $G_{U,Y|X,U}$. Hence, there exist some $u \in \mathcal{U}$ and
$y_i, y_j \in \mathcal{Y}$ such that $(u, y_i), (u, y_j) \in w'$ are connected, i.e.,

$$p(u, y_i, w') \cdot p(u, y_j, w') > 0. \qquad (42)$$

Now, $(u, y_i)$ and $(u, y_j)$ are connected in $G_{U,Y|X,U}$. This means that there exists some $x \in \mathcal{X}$ such that

$$p(x, u, y_i) \cdot p(x, u, y_j) > 0$$

$$f(x, y_i) \neq f(x, y_j). \qquad (43)$$

The relations (42), (43) and the Markov chain $X - (U, Y) - W'$ imply that

$$p(x, u, w') > 0$$

$$p(y_i \mid x, u, w') \cdot p(y_j \mid x, u, w') > 0$$

$$f(x, y_i) \neq f(x, y_j).$$

From these relations one concludes that

$$H(f(X, Y) \mid X, U, W') \geq H(f(X, Y) \mid X = x, U = u, W' = w') \cdot p(x, u, w') > 0,$$

which contradicts (41). ■
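The connectivity condition (43) used in the contradiction argument can be made concrete with a short sketch. It is only an illustration under assumptions: the joint law is a dictionary `p` mapping `(x, u, y)` to a probability, vertices are pairs `(u, y)`, two vertices with different `u` are never joined, and the function names are hypothetical.

```python
from itertools import combinations

def conditional_characteristic_graph(p, f):
    """Edge set of the graph on vertices (u, y) described by (43).

    (u, y1) and (u, y2) are joined when some x has p(x, u, y1) > 0,
    p(x, u, y2) > 0 and f(x, y1) != f(x, y2)."""
    xs = {x for (x, _, _) in p}
    vertices = sorted({(u, y) for (_, u, y), prob in p.items() if prob > 0})
    edges = set()
    for (u1, y1), (u2, y2) in combinations(vertices, 2):
        if u1 != u2:
            continue  # vertices with different u are not connected
        for x in xs:
            both_possible = p.get((x, u1, y1), 0) * p.get((x, u1, y2), 0) > 0
            if both_possible and f(x, y1) != f(x, y2):
                edges.add(frozenset({(u1, y1), (u1, y2)}))
                break
    return edges

def is_independent_set(w, edges):
    """True when no two vertices of w are joined by an edge."""
    return not any(frozenset(pair) in edges for pair in combinations(w, 2))
```

For instance, with binary $X$, $Y$, a constant $U$, a uniform joint law and $f(x, y) = x \wedge y$, the two vertices are joined because $x = 1$ distinguishes $y = 0$ from $y = 1$, so a set containing both is not independent.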
Proof of Theorem 3:

Achievability: Letting $U = X$, $V = \text{Constant}$, $W = f(X, Y)$, and using the Markov chain $T - X - f(X, Y)$,
gives the desired result. Note that in this case the cardinality bound can be tightened using Carathéodory's theorem
as in [17, Proof of Theorem 1].

Converse: Let $C_0 = \varphi_0(\mathbf{X})$ denote the message sent by transmitter-X to transmitter-Y. Let $C_X = \varphi_X(\mathbf{X})$ and
$C_Y = \varphi_Y(C_0, \mathbf{Y})$ be the messages sent by the transmitters to the receiver. Further, suppose that

$$P(\psi(C_X, C_Y) \neq f(\mathbf{X}, \mathbf{Y})) \leq \delta_n,$$

where $\delta_n \to 0$ when $n \to \infty$. Using Fano's inequality we have

$$H(f(\mathbf{X}, \mathbf{Y}) \mid C_X, C_Y) \leq \epsilon_n,$$

where $\epsilon_n \to 0$ when $n \to \infty$.

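Letting $T_i = (f(X_{i+1}^n, Y_{i+1}^n), C_X)$, the Markov chain $T_i - X_i - Y_i$ holds. Moreover, we have

$$\begin{aligned}
nR_Y &\geq \log_2 |C_Y| \geq H(C_Y) \\
&\geq I(C_Y; f(X_1^n, Y_1^n) \mid C_X) \\
&\geq \sum_{i=1}^{n} H(f(X_i, Y_i) \mid f(X_{i+1}^n, Y_{i+1}^n), C_X) - \epsilon_n \\
&= \sum_{i=1}^{n} H(f(X_i, Y_i) \mid T_i) - \epsilon_n, \qquad (44)
\end{aligned}$$

and

$$\begin{aligned}
n(R_X + R_Y) &\geq \log_2 |(C_X, C_Y)| \geq H(C_X, C_Y) \\
&\geq I(C_X, C_Y; X_1^n, f(X_1^n, Y_1^n)) \\
&= I(C_X, C_Y; f(X_1^n, Y_1^n)) + I(C_X, C_Y; X_1^n \mid f(X_1^n, Y_1^n)) \\
&\geq H(f(X_1^n, Y_1^n)) + I(C_X, C_Y; X_1^n \mid f(X_1^n, Y_1^n)) - \epsilon_n \\
&= \sum_{i=1}^{n} \big[H(f(X_i, Y_i)) + H(X_i \mid f(X_i, Y_i)) - H(X_i \mid X_1^{i-1}, f(X_1^n, Y_1^n), C_X, C_Y)\big] - \epsilon_n \\
&\geq \sum_{i=1}^{n} \big[H(f(X_i, Y_i)) + H(X_i \mid f(X_i, Y_i)) - H(X_i \mid C_X, f(X_{i+1}^n, Y_{i+1}^n), f(X_i, Y_i))\big] - \epsilon_n \\
&= \sum_{i=1}^{n} \big[H(f(X_i, Y_i)) + I(X_i; f(X_{i+1}^n, Y_{i+1}^n), C_X \mid f(X_i, Y_i))\big] - \epsilon_n \\
&= \sum_{i=1}^{n} \big[H(f(X_i, Y_i)) + I(X_i; T_i \mid f(X_i, Y_i))\big] - \epsilon_n. \qquad (45)
\end{aligned}$$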

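Let $Q$ be a uniform random variable over $\{1, 2, \ldots, n\}$. Let

$$X = X_Q, \qquad Y = Y_Q, \qquad T = (T_Q, Q).$$

Since the knowledge of $T$ gives $Q$ we have

$$H(Y \mid X, T) = \sum_{q=1}^{n} \frac{1}{n} H(Y_q \mid X_q, T_q) \overset{(a)}{=} \sum_{q=1}^{n} \frac{1}{n} H(Y_q \mid X_q) = H(Y \mid X),$$

where (a) is due to the Markov chain $T_q - X_q - Y_q$. Hence, the Markov chain

$$T - X - Y \qquad (46)$$

holds.

Moreover, we have the following equalities

$$H(f(X, Y) \mid T) = \frac{1}{n} \sum_{q=1}^{n} H(f(X_q, Y_q) \mid T_q)$$

$$H(f(X, Y)) + I(X; T \mid f(X, Y)) = \frac{1}{n} \sum_{q=1}^{n} \big[H(f(X_q, Y_q)) + I(X_q; T_q \mid f(X_q, Y_q))\big].$$

This, together with (44), (45), and (46), completes the proof. ■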
Proof of Theorem 5: From the converse of [16, Theorem 3] we deduce that if a rate pair $(R_0, R_Y)$ is
achievable, then there exist random variables $U$ and $W$ that satisfy (13) and

$$U - X - Y$$

$$X - (U, Y) - W$$

$$H(f(X, Y) \mid X, U, W) = 0.$$

Finally, the same argument as the final argument of the converse proof of Theorem 2 shows that $W' = S_{(U,Y)}(W)$
satisfies the above relations, the inequalities of the theorem, and

$$Y \in W' \in \Gamma(G_{U,Y|X,U}).$$

The cardinality bounds for $U$ and $W$ can be derived using the same methods as used in the proof of Theorem 1
for bounding the cardinalities of $T$ and $W$, respectively. ■
Appendix

Proofs of Example 4 claims 1., 2., and 3.
In this Appendix we prove the claims stated in Example 4.
1. Suppose $T'$, $V'$ and $W'$ satisfy the conditions of [11, Theorem 5.1], i.e.,

$$T' - X - Y$$

$$V' - (X, T') - (T', Y) - W' \qquad (47)$$

and that there exist functions $g_1(T', V', W')$ and $g_2(T', V', W')$ such that

$$E\,d_1(X, g_1(V', T', W')) = 0$$

$$E\,d_2(Y, g_2(V', T', W')) = 0. \qquad (48)$$

With this choice of auxiliary random variables $T'$, $V'$, and $W'$ the sum-rate constraint in [11, Theorem 5.1]
becomes

$$R_X + R_Y \geq I(X, Y; V', T', W'). \qquad (49)$$

The minimum of the right-hand side of (49) over $(T', V', W')$ can be restricted to the case where $V'$ is a
constant. To see this, replace $T'$ by $T \overset{\text{def}}{=} (T', V')$ and let $V$ be a constant. The random variables $T$, $V$, and
$W'$ satisfy (47) and (48) and give the same sum-rate constraint as in (49).
We now want to find the minimum of

$$I(X, Y; T, W) \qquad (50)$$

for some $T$ and $W$ that satisfy

$$T - X - Y$$

$$X - (T, Y) - W \qquad (51)$$

and such that there exist functions $g_1(T, W)$ and $g_2(T, W)$ such that

$$E\,d_1(X, g_1(T, W)) = 0$$

$$E\,d_2(Y, g_2(T, W)) = 0. \qquad (52)$$

Since [11, Theorem 5.1] is a special case of Theorem 7 with $U = \text{Constant}$, we can apply the cardinality
bounds established in Theorem 7 and deduce that

$$|\mathcal{T}| \leq 7$$

$$|\mathcal{W}| \leq 5.$$

The minimum of (50) with the above cardinality bounds can in principle be numerically evaluated to obtain
$\min I(X, Y; T, W) = 1.03$. However, the number of degrees of freedom in the minimization still makes the
problem intractable on a regular desktop computer. As it turns out, for the problem at hand the cardinality
bound $|\mathcal{W}| \leq 5$ can be tightened to $|\mathcal{W}| \leq 2$, which then allows one to obtain the desired minimum in a matter of
seconds on a regular computer.
2. The sum-rate constraint in [11, Theorem 5.4] is

$$R_X + R_Y \geq I(X, Y; T, W)$$

for some $T$ and $W$ that satisfy

$$T - X - Y,$$

and such that there exist functions $g_1(W)$ and $g_2(W)$ such that

$$E[d_1(X, g_1(W))] = 0$$

$$E[d_2(Y, g_2(W))] = 0. \qquad (53)$$

Since $I(X, Y; T, W) \geq I(X, Y; W)$, letting $T$ be a constant decreases the sum-rate constraint. We now
want to find the infimum of $I(X, Y; W)$ over $W$'s that satisfy (53) for some $g_1(W)$ and $g_2(W)$.
The distortion criteria (53) imply that for any $w \in \mathcal{W}$ with $p(w) > 0$, we should have

$$P(W = w \mid X = -1) \cdot P(W = w \mid X = +1) = 0$$

$$P(W = w \mid Y = -1) \cdot P(W = w \mid Y = +1) = 0.$$

Because of the symmetry of $X$ and $Y$, $I(X, Y; W)$ is minimized for the random variable $W \in \{w_-, w_+\}$
with probability distribution

$$p(w_- \mid x, y) = \begin{cases} 1 & \text{if } x = -1 \text{ or } y = -1 \\ \tfrac{1}{2} & \text{if } (x, y) = (0, 0) \\ 0 & \text{otherwise} \end{cases}$$

$$p(w_+ \mid x, y) = \begin{cases} 1 & \text{if } x = +1 \text{ or } y = +1 \\ \tfrac{1}{2} & \text{if } (x, y) = (0, 0) \\ 0 & \text{otherwise.} \end{cases}$$

This $W$ satisfies the Markov chain and distortion criteria constraints of [11, Theorem 5.4] and gives

$$\inf (R_X + R_Y) = I(X, Y; W) = 0.85.$$

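3. Let $T$, $V$ be constants and $U \in \{u_-, u_+\}$ have the probability distribution

$$p(u_- \mid x) = \begin{cases} 1 & \text{if } x = -1 \\ \tfrac{1}{2} & \text{if } x = 0 \\ 0 & \text{otherwise} \end{cases}$$

$$p(u_+ \mid x) = \begin{cases} 1 & \text{if } x = +1 \\ \tfrac{1}{2} & \text{if } x = 0 \\ 0 & \text{otherwise.} \end{cases}$$

Let $W \in \{w_-, w_+\}$ have the probability distribution

$$p(w_- \mid u, y) = \begin{cases} 1 & \text{if } (u, y) \in \{(u_-, -1), (u_-, 0), (u_+, -1)\} \\ 0 & \text{otherwise} \end{cases}$$

$$p(w_+ \mid u, y) = \begin{cases} 1 & \text{if } (u, y) \in \{(u_+, +1), (u_+, 0), (u_-, +1)\} \\ 0 & \text{otherwise.} \end{cases}$$

Random variables $T$, $U$, $V$, and $W$ satisfy the Markov chains and distortion criteria of Theorem 7. These
random variables give the sum rate

$$R_X + R_Y \geq I(X, Y; W) = 0.85$$

with

$$R_0 \geq I(X; U \mid Y) = 0.38.$$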

Appendix 
Jointly Typical Sequences 

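Let $(x^n, y^n) \in \mathcal{X}^n \times \mathcal{Y}^n$. Define the empirical probability mass function (or the type) of $(x^n, y^n)$ as

$$\pi_{x^n, y^n}(x, y) = \frac{|\{i : (x_i, y_i) = (x, y)\}|}{n}, \qquad (x, y) \in \mathcal{X} \times \mathcal{Y}. \qquad (54)$$

Let $(X, Y) \sim p(x, y)$. The set of jointly $\epsilon$-typical $n$-sequences is defined as

$$\mathcal{A}_\epsilon^{(n)}(X, Y) \overset{\text{def}}{=} \{(x^n, y^n) : |\pi_{x^n, y^n}(x, y) - p(x, y)| \leq \epsilon \cdot p(x, y) \text{ for all } (x, y) \in \mathcal{X} \times \mathcal{Y}\}. \qquad (55)$$

Also define the set of conditionally $\epsilon$-typical $n$-sequences as

$$\mathcal{A}_\epsilon^{(n)}(Y \mid x^n) \overset{\text{def}}{=} \{y^n : (x^n, y^n) \in \mathcal{A}_\epsilon^{(n)}(X, Y)\}. \qquad (56)$$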
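The definitions of the type and of joint $\epsilon$-typicality can be turned into a short membership test. The sketch below is an illustration under assumptions: sequences are Python lists of equal length, the distribution `p` is a dictionary mapping `(x, y)` to a probability over the whole alphabet, and the function names are hypothetical.

```python
from collections import Counter

def empirical_pmf(xs, ys):
    """Type of the pair sequence: relative frequency of each (x, y)."""
    n = len(xs)
    counts = Counter(zip(xs, ys))
    return {pair: c / n for pair, c in counts.items()}

def is_jointly_typical(xs, ys, p, eps):
    """Check |type(x, y) - p(x, y)| <= eps * p(x, y) for every (x, y)."""
    pi = empirical_pmf(xs, ys)
    # A pair observed outside the support of p violates the condition,
    # because p(x, y) = 0 forces the type to be 0 there.
    if any(p.get(pair, 0.0) == 0.0 for pair in pi):
        return False
    return all(abs(pi.get(pair, 0.0) - prob) <= eps * prob
               for pair, prob in p.items())
```

The early exit reflects Claim 3.b of Lemma 1: every symbol pair in a jointly typical sequence has positive probability.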



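Jointly typical sequences satisfy the following properties:

Lemma 1. [16, Corollary 2], [5, Page 27]

1. Let $(X^n, Y^n) \sim \prod_{i=1}^{n} p_{X,Y}(x_i, y_i)$. Then

$$P((X^n, Y^n) \in \mathcal{A}_\epsilon^{(n)}(X, Y)) \geq 1 - \delta(\epsilon).$$

2. $(1 - \delta(\epsilon)) \, 2^{nH(X,Y)(1-\epsilon)} \leq |\mathcal{A}_\epsilon^{(n)}(X, Y)| \leq 2^{nH(X,Y)(1+\epsilon)}$.

3. Let $p(x^n, y^n) = \prod_{i=1}^{n} p_{X,Y}(x_i, y_i)$. Then, for each $(x^n, y^n) \in \mathcal{A}_\epsilon^{(n)}(X, Y)$,

a. $x^n \in \mathcal{A}_\epsilon^{(n)}(X)$ and $y^n \in \mathcal{A}_\epsilon^{(n)}(Y)$;

b. $p_{X,Y}(x_i, y_i) > 0$ for all $1 \leq i \leq n$;

c. $2^{-nH(X,Y)(1+\epsilon)} \leq p(x^n, y^n) \leq 2^{-nH(X,Y)(1-\epsilon)}$.

Lemma 2 (Conditional Typicality Lemma). Let $(X, Y) \sim p(x, y)$. Suppose that $x^n \in \mathcal{A}_{\epsilon'}^{(n)}(X)$ and
$Y^n \sim p(y^n \mid x^n) = \prod_{i=1}^{n} p_{Y|X}(y_i \mid x_i)$. Then, for $\epsilon > \epsilon'$,

$$P((x^n, Y^n) \in \mathcal{A}_\epsilon^{(n)}(X, Y)) \geq 1 - \delta(\epsilon, \epsilon').$$

Lemma 3 (Markov Lemma). [16, Lemma 23] Let $X - Y - Z$ form a Markov chain. Suppose that $(x^n, y^n) \in \mathcal{A}_{\epsilon'}^{(n)}(X, Y)$
and $Z^n \sim p(z^n \mid y^n) = \prod_{i=1}^{n} p_{Z|Y}(z_i \mid y_i)$. Then, for $\epsilon > \epsilon'$,

$$P((x^n, y^n, Z^n) \in \mathcal{A}_\epsilon^{(n)}(X, Y, Z)) \geq 1 - \delta(\epsilon, \epsilon').$$

Lemma 4. [16, Corollary 4] Let $(X, Y) \sim p_{X,Y}(x, y)$ with marginal probability distributions $p_X(x)$ and $p_Y(y)$.
Let $(\mathbf{X}', \mathbf{Y}') \sim \prod_{i=1}^{n} p_X(x'_i) \cdot p_Y(y'_i)$. Then,

$$(1 - \delta(\epsilon)) \cdot 2^{-n(I(X;Y) + 2\epsilon H(Y))} \leq P((\mathbf{X}', \mathbf{Y}') \in \mathcal{A}_\epsilon^{(n)}(X, Y)) \leq 2^{-n(I(X;Y) - 2\epsilon H(Y))}.$$

Lemma 5 (Covering Lemma). [5, Lemma 3.3] Let $(X, \hat{X}) \sim p_{X,\hat{X}}(x, \hat{x})$. Let $X^n \sim \prod_{i=1}^{n} p_X(x_i)$ and

$$\{\hat{X}^n(m), m \in \mathcal{B}\} \text{ with } |\mathcal{B}| \geq 2^{nR}$$

be a set of random sequences independent of each other and of $X^n$, each distributed according to $\prod_{i=1}^{n} p_{\hat{X}}(\hat{x}_i(m))$.
Then, there exists $\delta(\epsilon)$ that tends to zero as $\epsilon \to 0$ such that

$$\lim_{n \to \infty} P((X^n, \hat{X}^n(m)) \notin \mathcal{A}_\epsilon^{(n)}(X, \hat{X}) \text{ for all } m \in \mathcal{B}) = 0,$$

if

$$R > I(X; \hat{X}) + \delta(\epsilon).$$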



